Identifying robust clusters and multi-community nodes by combining top-down and bottom-up approaches to clustering

Authors

  • Christopher Gaiteri
  • Mingming Chen
  • Boleslaw K. Szymanski
  • Konstantin Kuzmin
  • Jierui Xie
  • Changkyu Lee
  • Timothy Blanche
  • Elias Chaibub Neto
  • Su-Chun Huang
  • Thomas J. Grabowski
  • Tara M. Madhyastha
  • Vitalina Komashko
Abstract

Biological functions are often realized by groups of interacting molecules or cells. Membership in these groups may overlap when molecules or cells are reused in multiple functions. Traditional clustering methods assign components to no more than one group and cannot identify multi-community nodes. Technical noise is common in high-throughput biological datasets and further blurs distinctions between clusters. Together, overlapping nodes and high levels of noise reduce our ability to accurately define clusters in biological datasets and to interpret their biological functions. To address these limitations, we designed an algorithm called SpeakEasy, which detects overlapping or non-overlapping communities in commonly studied biological networks. Input to SpeakEasy can be physical networks, such as molecular interactions, or inferred networks, such as gene coexpression networks. The networks can be directed or undirected, and may contain negative links. SpeakEasy combines traditional bottom-up and top-down approaches to clustering by creating competition between clusters. Nodes that oscillate between multiple clusters in this competition are classified as multi-community nodes. SpeakEasy can quickly process networks with tens of thousands of nodes, quantify the stability of each cluster, and select an optimal number of clusters automatically, without requiring manual "parameter tuning" for class-leading results. Clustering networks derived from gene microarrays, protein affinity, sorted cell populations, electrophysiology, functional magnetic resonance imaging of resting-state brain activity, and synthetic datasets validate the ability of SpeakEasy to facilitate biological insights. For instance, we can identify overlapping co-regulated gene sets, multi-complex proteins, and robust changes to the community structure of co-active brain regions in Parkinson disease. These insights rely on a robust overlapping clustering approach that enables a more realistic interpretation of common high-throughput datasets.
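
The idea that nodes oscillating between clusters are treated as multi-community nodes can be illustrated with a small sketch. This is not the SpeakEasy implementation: the label histories, the function names, and the `min_share` threshold below are all hypothetical, chosen only to show how a node's record of cluster labels over repeated updates could be turned into one or several memberships.

```python
from collections import Counter


def community_memberships(history, min_share=0.25):
    """Return the labels a node held for at least `min_share` of its history.

    `history` is the (hypothetical) sequence of cluster labels one node
    carried over successive update rounds; `min_share` is an assumed cutoff.
    """
    counts = Counter(history)
    total = len(history)
    return sorted(label for label, n in counts.items() if n / total >= min_share)


def find_multi_community_nodes(histories, min_share=0.25):
    """Map each node that qualifies for more than one label to those labels."""
    result = {}
    for node, history in histories.items():
        labels = community_memberships(history, min_share)
        if len(labels) > 1:
            result[node] = labels
    return result


# Toy label histories (made up for illustration, not real data).
histories = {
    "a": ["A"] * 8,                                  # stable in cluster A
    "b": ["A"] * 7 + ["B"],                          # one brief flip: treated as noise
    "c": ["A", "B", "A", "B", "A", "B", "B", "A"],   # keeps switching between A and B
    "d": ["B"] * 8,                                  # stable in cluster B
}

multi = find_multi_community_nodes(histories)
print(multi)
```

In this toy input only node `c` spends a substantial share of rounds in both clusters, so it is the only node reported as multi-community. A single stray flip, as for node `b`, stays below the assumed threshold and is ignored.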

Similar resources

Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering

Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional metho...

Supplemental methods for: Identifying robust communities and multi-community nodes by combining top-down and bottom-up approaches to clustering

Clustering performance on real-world networks. Traditionally, the performance of clustering methods on networks with unknown correct clustering solutions is measured in terms of modularity ("Q"). Modularity measures the number of within-community connections relative to the number expected at random. This measure has a maximum value of 1, but in practice the maximum possible Q-value will be less than 1...

A Comparative Study of Effect of Bottom-up and Top-down Instructional Approaches on EFL Learners’ Vocabulary Recall and Retention

This quasi-experimental study investigated the effect of bottom-up and top-down instructional approaches on English as a foreign language (EFL) vocabulary recall and retention. To this end, 44 high school students from two intact classes were assigned to bottom-up (n = 21) and top-down (n = 23) groups. The participants were exposed to 20 hours of explicit vocabulary instruction during 10 weeks ...

The effect of bottom-up and top-down auditory program training on the development of children's auditory processing skills

Although there have been several previous investigations on the role of auditory training for the development of auditory processing skills, it still remains unknown whether children with auditory processing difficulties can get improved auditory skills after exposure to a multi-modal training experience comprising both visual and tactile stimuli. The present study, therefore, attempted to use ...

Journal:
  • CoRR

Volume: abs/1501.04709

Publication date: 2015